RESULT 9
AR022630
LOCUS       AR022630     1650 bp    DNA     linear   PAT 05-DEC-1998
DEFINITION  Sequence 5 from patent US 5792903.
ACCESSION   AR022630
VERSION     AR022630.  GI:3976692
KEYWORDS
SOURCE      Unknown.
  ORGANISM  Unknown.
            Unclassified.
REFERENCE   1  (bases 1 to 1650)
  AUTHORS   Hirschberg, J., Cunningham, F. Xavier Jr. and Gantt, E.
  TITLE     Lycopene cyclase gene
  JOURNAL   Patent: US 5792903-A 5 11-AUG-1998;
FEATURES             Location/Qualifiers
     source          1..1650
                     /organism="unknown"
                     /mol_type="unassigned DNA"
ORIGIN

Alignment Scores:
Pred. No.:              3.77e-261    Length:        1650
Score:                  2622.00      Matches:       499
Percent Similarity:     100.0%       Conservative:  1
Best Local Similarity:  99.8%        Mismatches:    0
Query Match:            99.8%        Indels:        0
DB:                     2            Gaps:          0

US-10-524-827-18 (1-500) x AR022630 (1-1650)



Qy    1 MetAspThrLeuLeuLysThrProAsnAsnLeuGluPheLeuAsnProHisHisGlyPhe 20
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  112 ATGGATACTTTGTTGAAAACCCCAAATAACCTTGAATTTCTGAACCCACATCATGGTTTT 171

Qy   21 AlaValLysAlaSerThrPheArgSerGluLysHisHisAsnPheGlySerArgLysPhe 40
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  172 GCTGTTAAAGCTAGTACCTTTAGATCTGAGAAGCATCATAATTTTGGTTCTAGGAAGTTT 231

Qy   41 CysGluThrLeuGlyArgSerValCysValLysGlySerSerSerAlaLeuLeuGluLeu 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  232 TGTGAAACTTTGGGTAGAAGTGTTTGTGTTAAGGGTAGTAGTAGTGCTCTTTTAGAGCTT 291

Qy   61 ValProGluThrLysLysGluAsnLeuAspPheGluLeuProMetTyrAspProSerLys 80
        ||||||||||||||||||||||||||||||||||||||||||:::|||||||||||||||
Db  292 GTACCTGAGACCAAAAAGGAGAATCTTGATTTTGAGCTTCCTATCTATGACCCTTCAAAA 351

Qy   81 GlyValValValAspLeuAlaValValGlyGlyGlyProAlaGlyLeuAlaValAlaGln 100
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  352 GGGGTTGTTGTGGATCTTGCTGTGGTTGGTGGTGGCCCTGCAGGACTTGCTGTTGCACAG 411

Qy  101 GlnValSerGluAlaGlyLeuSerValCysSerIleAspProAsnProLysLeuIleTrp 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  412 CAAGTTTCTGAAGCAGGACTCTCTGTTTGTTCAATTGATCCGAATCCTAAATTGATATGG 471

Qy  121 ProAsnAsnTyrGlyValTrpValAspGluPheGluAlaMetAspLeuLeuAspCysLeu 140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  472 CCTAATAACTATGGTGTTTGGGTGGATGAATTTGAGGCTATGGACTTGTTAGATTGTCTA 531

Qy  141 AspAlaThrTrpSerGlyAlaAlaValTyrIleAspAspAsnThrAlaLysAspLeuHis 160
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  532 GATGCTACCTGGTCTGGTGCAGCAGTGTACATTGATGATAATACGGCTAAAGATCTTCAT 591

Qy  161 ArgProTyrGlyArgValAsnArgLysGlnLeuLysSerLysMetMetGlnLysCysIle 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  592 [illegible] 651

Qy  181 MetAsnGlyValLysPheHisGlnAlaLysValIleLysValIleHisGluGluSerLys 200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  652 [illegible] 711

Qy  201 SerMetLeuIleCysAsnAspGlyIleThrIleGlnAlaThrValValLeuAspAlaThr 220
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  712 TCCATGTTGATATGCAATGATGGTATTACTATTCAGGCAACGGTGGTGCTCGATGCAACT 771

Qy  221 GlyPheSerArgSerLeuValGlnTyrAspLysProTyrAsnProGlyTyrGlnValAla 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  772 GGCTTCTCTAGATCTCTTGTTCAGTATGATAAGCCTTATAACCCCGGGTATCAAGTTGCT 831

Qy  241 TyrGlyIleLeuAlaGluValGluGluHisProPheAspValAsnLysMetValPheMet 260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  832 TATGGCATTTTGGCTGAAGTGGAAGAGCACCCCTTTGATGTAAACAAGATGGTTTTCATG 891

Qy  261 AspTrpArgAspSerHisLeuLysAsnAsnThrAspLeuLysGluArgAsnSerArgIle 280
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  892 GATTGGCGAGATTCTCATTTGAAGAACAATACTGATCTCAAGGAGAGAAATAGTAGAATA 951

Qy  281 ProThrPheLeuTyrAlaMetProPheSerSerAsnArgIlePheLeuGluGluThrSer 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  952 CCAACTTTTCTTTATGCAATGCCATTTTCATCCAACAGGATATTTCTTGAAGAAACATCA 1011

Qy  301 LeuValAlaArgProGlyLeuArgIleAspAspIleGlnGluArgMetValAlaArgLeu 320
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1012 CTCGTAGCTCGTCCTGGCTTGCGTATAGATGATATTCAAGAACGAATGGTGGCTCGTTTA 1071

Qy  321 AsnHisLeuGlyIleLysValLysSerIleGluGluAspGluHisCysLeuIleProMet 340
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1072 AACCATTTGGGGATAAAAGTGAAGAGCATTGAAGAAGATGAACATTGTCTAATACCAATG 1131

Qy  341 GlyGlyProLeuProValLeuProGlnArgValValGlyIleGlyGlyThrAlaGlyMet 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1132 GGTGGTCCACTTCCAGTATTACCTCAGAGAGTCGTTGGAATCGGTGGTACAGCTGGCATG 1191

Qy  361 ValHisProSerThrGlyTyrMetValAlaArgThrLeuAlaAlaAlaProValValAla 380
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1192 GTTCATCCATCCACCGGTTATATGGTGGCAAGGACACTAGCTGCGGCTCCTGTTGTTGCC 1251

Qy  381 AsnAlaIleIleGlnTyrLeuGlySerGluArgSerHisSerGlyAsnGluLeuSerThr 400
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1252 AATGCCATAATTCAATACCTCGGTTCTGAAAGAAGTCATTCGGGTAATGAATTATCCACA 1311

Qy  401 AlaValTrpLysAspLeuTrpProIleGluArgArgArgGlnArgGluPhePheCysPhe 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1312 GCTGTTTGGAAAGATTTGTGGCCTATAGAGAGGAGACGTCAAAGAGAGTTCTTCTGCTTC 1371

Qy  421 GlyMetAspIleLeuLeuLysLeuAspLeuProAlaThrArgArgPhePheAspAlaPhe 440
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1372 GGTATGGATATTCTTCTGAAGCTTGATTTACCTGCTACAAGAAGGTTCTTTGATGCATTC 1431

Qy  441 PheAspLeuGluProArgTyrTrpHisGlyPheLeuSerSerArgLeuPheLeuProGlu 460
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1432 TTTGACTTAGAACCTCGTTATTGGCATGGCTTCTTATCGTCTCGATTGTTTCTACCTGAA 1491

Qy  461 LeuIleValPheGlyLeuSerLeuPheSerHisAlaSerAsnThrSerArgPheGluIle 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1492 CTCATAGTTTTTGGGCTGTCTCTATTCTCTCATGCTTCAAATACTTCTAGATTTGAGATA 1551

Qy  481 MetThrLysGlyThrValProLeuValAsnMetIleAsnAsnLeuLeuGlnAspLysGlu 500
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1552 ATGACAAAGGGAACTGTTCCATTAGTAAATATGATCAACAATTTGTTACAGGATAAAGAA 1611

RESULT 1
AAC65654
ID   AAC65654 standard; DNA; 1608 BP.
XX
AC   AAC65654;
XX
DT   16-FEB-2001  (first entry)
XX
DE   H. pluvialis carotene hydroxylase DNA.
XX
KW   Carotene hydroxylase; beta-carotene; zeaxanthine; canthaxanthine;
KW   astaxanthine; biocatalyst; xanthophyll; beta-ionone; 4-keto-beta-ionone;
KW   3-hydroxy-beta-ionone; 3-hydroxy-4-keto-beta-ionone; plant; nutrition;
KW   colorant; vitamin A precursor; immunostimulant; cancer-prevention;
KW   antioxidant; ds.
XX
OS   Haematococcus pluvialis.
XX
FH   Key             Location/Qualifiers
FT   CDS             1..971
FT                   /*tag= a
FT                   /product= "carotene hydroxylase"
FT                   /note= "no start codon given"
XX
PN   DE19916140-A1.
XX
PD   12-OCT-2000.
XX
PF   09-APR-1999; 99DE-01016140.
XX
PR   09-APR-1999; 99DE-01016140.
XX
PA   (BADI) BASF AG.
XX
PI   Linden H, Sandmann G;
XX
DR   WPI; 2000-657331/64.
DR   P-PSDB; AAB11111.
XX
PT   New recombinant algal carotene hydroxylase protein useful for converting
PT   carotenes to xanthophylls, e.g. for use in human and animal nutrition.
XX
PS   Claim 4; Page 12-14; 18pp; German.
XX
CC   This invention describes a novel recombinant Haematococcus pluvialis
CC   carotene hydroxylase protein (I) which is capable of converting beta-
CC   carotene to zeaxanthine and/or converting canthaxanthine to astaxanthine.
CC   (I) is useful as a biocatalyst for producing xanthophylls by converting a
CC   beta-ionone structural element into a 3-hydroxy-beta-ionone structural
CC   element and/or converting a 4-keto-beta-ionone structural element into a
CC   3-hydroxy-4-keto-beta-ionone structural element. Nucleic acids encoding
CC   (I) are useful for producing genetically modified organisms, especially
CC   plants, exhibiting increased expression of xanthophylls. Xanthophylls are
CC   important in human and animal nutrition as colorants and vitamin A
CC   precursors, have health-promoting (e.g. immunostimulant) properties, and
CC   have cancer-preventing antioxidant activity.
XX
SQ   Sequence 1608 BP; 327 A; 414 C; 513 G; 354 T; 0 U; 0 Other;

  Query Match 100.0%; Score 1608; DB 3; Length 1608;
  Best Local Similarity 100.0%; Pred. No. 0;
  Matches 1608; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy    1 CTACATTTCACAAGCCCGTGAGCGGTGCAAGCGCTCTGCCCCACATCGGCCCACCTCCTC 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 CTACATTTCACAAGCCCGTGAGCGGTGCAAGCGCTCTGCCCCACATCGGCCCACCTCCTC 60

Qy   61 ATCTCCATCGGTCATTTGCTGCTACCACGATGCTGTCGAAGCTGCAGTCAATCAGCGTCA 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 ATCTCCATCGGTCATTTGCTGCTACCACGATGCTGTCGAAGCTGCAGTCAATCAGCGTCA 120

Qy  121 AGGCCCGCCGCGTTGAACTAGCCCGCGACATCACGCGGCCCAAAGTCTGCCTGCATGCTC 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 AGGCCCGCCGCGTTGAACTAGCCCGCGACATCACGCGGCCCAAAGTCTGCCTGCATGCTC 180

Qy  181 AGCGGTGCTCGTTAGTTCGGCTGCGAGTGGCAGCACCACAGACAGAGGAGGCGCTGGGAA 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 AGCGGTGCTCGTTAGTTCGGCTGCGAGTGGCAGCACCACAGACAGAGGAGGCGCTGGGAA 240

Qy  241 CCGTGCAGGCTGCCGGCGCGGGCGATGAGCACAGCGCCGATGTAGCACTCCAGCAGCTTG 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 CCGTGCAGGCTGCCGGCGCGGGCGATGAGCACAGCGCCGATGTAGCACTCCAGCAGCTTG 300

Qy  301 ACCGGGCTATCGCAGAGCGTCGTGCCCGGCGCAAACGGGAGCAGCTGTCATACCAGGCTG 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 ACCGGGCTATCGCAGAGCGTCGTGCCCGGCGCAAACGGGAGCAGCTGTCATACCAGGCTG 360

Qy  361 CCGCCATTGCAGCATCAATTGGCGTGTCAGGCATTGCCATCTTCGCCACCTACCTGAGAT 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 CCGCCATTGCAGCATCAATTGGCGTGTCAGGCATTGCCATCTTCGCCACCTACCTGAGAT 420

Qy  421 TTGCCATGCACATGACCGTGGGCGGCGCAGTGCCATGGGGTGAAGTGGCTGGCACTCTCC 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 TTGCCATGCACATGACCGTGGGCGGCGCAGTGCCATGGGGTGAAGTGGCTGGCACTCTCC 480

Qy  481 TCTTGGTGGTTGGTGGCGCGCTCGGCATGGAGATGTATGCCCGCTATGCACACAAAGCCA 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 TCTTGGTGGTTGGTGGCGCGCTCGGCATGGAGATGTATGCCCGCTATGCACACAAAGCCA 540

Qy  541 TCTGGCATGAGTCGCCTCTGGGCTGGCTGCTGCACAAGAGCCACCACACACCTCGCACTG 600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  541 TCTGGCATGAGTCGCCTCTGGGCTGGCTGCTGCACAAGAGCCACCACACACCTCGCACTG 600

Qy  601 GACCCTTTGAAGCCAACGACTTGTTTGCAATCATCAATGGACTGCCCGCCATGCTCCTGT 660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  601 GACCCTTTGAAGCCAACGACTTGTTTGCAATCATCAATGGACTGCCCGCCATGCTCCTGT 660

Qy  661 GTACCTTTGGCTTCTGGCTGCCCAACGTCCTGGGGGCGGCCTGCTTTGGAGCGGGGCTGG 720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  661 GTACCTTTGGCTTCTGGCTGCCCAACGTCCTGGGGGCGGCCTGCTTTGGAGCGGGGCTGG 720

Qy  721 GCATCACGCTATACGGCATGGCATATATGTTTGTACACGATGGCCTGGTGCACAGGCGCT 780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  721 GCATCACGCTATACGGCATGGCATATATGTTTGTACACGATGGCCTGGTGCACAGGCGCT 780

Qy  781 TTCCCACCGGGCCCATCGCTGGCCTGCCCTACATGAAGCGCCTGACAGTGGCCCACCAGC 840
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  781 TTCCCACCGGGCCCATCGCTGGCCTGCCCTACATGAAGCGCCTGACAGTGGCCCACCAGC 840

Qy  841 TACACCACAGCGGCAAGTACGGTGGCGCGCCCTGGGGTATGTTCTTGGGTCCACAGGAGC 900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  841 TACACCACAGCGGCAAGTACGGTGGCGCGCCCTGGGGTATGTTCTTGGGTCCACAGGAGC 900

Qy  901 TGCAGCACATTCCAGGTGCGGCGGAGGAGGTGGAGCGACTGGTCCTGGAACTGGACTGGT 960
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  901 TGCAGCACATTCCAGGTGCGGCGGAGGAGGTGGAGCGACTGGTCCTGGAACTGGACTGGT 960

Qy  961 CCAAGCGGTAGGGTGCGGAACCAGGCACGCTGGTTTCACACCTCATGCCTGTGATAAGGT 1020
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  961 CCAAGCGGTAGGGTGCGGAACCAGGCACGCTGGTTTCACACCTCATGCCTGTGATAAGGT 1020

Qy 1021 GTGGCTAGAGCGATGCGTGTGAGACGGGTATGTCACGGTCGACTGGTCTGATGGCCAATG 1080
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1021 GTGGCTAGAGCGATGCGTGTGAGACGGGTATGTCACGGTCGACTGGTCTGATGGCCAATG 1080

Qy 1081 GCATCGGCCATGTCTGGTCATCACGGGCTGGTTGCCTGGGTGAAGGTGATGCACATCATC 1140
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1081 GCATCGGCCATGTCTGGTCATCACGGGCTGGTTGCCTGGGTGAAGGTGATGCACATCATC 1140

Qy 1141 ATGTGCGGTTGGAGGGGCTGGCACAGTGTGGGCTGAACTGGAGCAGTTGTCCAGGCTGGC 1200
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1141 ATGTGCGGTTGGAGGGGCTGGCACAGTGTGGGCTGAACTGGAGCAGTTGTCCAGGCTGGC 1200

Qy 1201 GTTGAATCAGTGAGGGTTTGTGATTGGCGGTTGTGAAGCAATGACTCCGCCCATATTCTA 1260
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1201 GTTGAATCAGTGAGGGTTTGTGATTGGCGGTTGTGAAGCAATGACTCCGCCCATATTCTA 1260

Qy 1261 TTTGTGGGAGCTGAGATGATGGCATGCTTGGGATGTGCATGGATCATGGTAGTGCAGCAA 1320
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1261 TTTGTGGGAGCTGAGATGATGGCATGCTTGGGATGTGCATGGATCATGGTAGTGCAGCAA 1320

Qy 1321 ACTATATTCACCTAGGGCTGTTGGTAGGATCAGGTGAGGCCTTGCACATTGCATGATGTA 1380
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1321 ACTATATTCACCTAGGGCTGTTGGTAGGATCAGGTGAGGCCTTGCACATTGCATGATGTA 1380

Qy 1381 CTCGTCATGGTGTGTTGGTGAGAGGATGGATGTGGATGGATGTGTATTCTCAGACGTAGA 1440
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1381 CTCGTCATGGTGTGTTGGTGAGAGGATGGATGTGGATGGATGTGTATTCTCAGACGTAGA 1440

Qy 1441 CCTTGACTGGAGGCTTGATCGAGAGAGTGGGCCGTATTCTTTGAGAGGGGAGGCTCGTGC 1500
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1441 CCTTGACTGGAGGCTTGATCGAGAGAGTGGGCCGTATTCTTTGAGAGGGGAGGCTCGTGC 1500

Qy 1501 CAGAAATGGTGAGTGGATGACTGTGACGCTGTACATTGCAGGCAGGTGAGATGCACTGTC 1560
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db 1501 CAGAAATGGTGAGTGGATGACTGTGACGCTGTACATTGCAGGCAGGTGAGATGCACTGTC 1560

Qy 1561 TCGATTGTAAAATACATTCAGATGCAAAAAAAAAAAAAAAAAAAAAAA 1608
        ||||||||||||||||||||||||||||||||||||||||||||||||
Db 1561 TCGATTGTAAAATACATTCAGATGCAAAAAAAAAAAAAAAAAAAAAAA 1608

RESULT 1
AF2204
beta-carotene ketolase [imported] - Nostoc sp. (strain PCC 7120)
C;Species: Nostoc sp. PCC 7120
A;Note: Nostoc sp. strain PCC 7120 is a synonym of Anabaena sp. strain PCC 7120
C;Date: 14-Dec-2001 #sequence_revision 14-Dec-2001 #text_change 09-Jul-2004
C;Accession: AF2204
R;Kaneko, T.; Nakamura, Y.; Wolk, C.P.; Kuritz, T.; Sasamoto, S.; Watanabe, A.; Iriguc
DNA Res. 8, 205-213, 2001
A;Title: Complete Genomic Sequence of the Filamentous Nitrogen-fixing Cyanobacterium A
A;Reference number: AB1807; MUID:21595285; PMID:11759840
A;Accession: AF2204
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-258
A;Cross-references: UNIPROT:Q8YSA0; UNIPARC:UPI00000CE6D5; GB:BA000019; PIDN:BAB74888.
A;Experimental source: strain PCC 7120
C;Genetics:
A;Gene: alr3189
C;Superfamily: beta-carotene ketolase

  Query Match 100.0%; Score 1439; DB 2; Length 258;
  Best Local Similarity 100.0%; Pred. No. 2.9e-116;
  Matches 258; Conservative 0; Mismatches 0; Indels 0; Gaps 0;

Qy    1 MVQCQPSSLHSEKLVLLSSTIRDDKNINKGIFIACFILFLWAISLILLLSIDTSIIHKSL 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 MVQCQPSSLHSEKLVLLSSTIRDDKNINKGIFIACFILFLWAISLILLLSIDTSIIHKSL 60

Qy   61 LGIAMLWQTFLYTGLFITAHDAMHGVVYPKNPRINNFIGKLTLILYGLLPYKDLLKKHWL 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 LGIAMLWQTFLYTGLFITAHDAMHGVVYPKNPRINNFIGKLTLILYGLLPYKDLLKKHWL 120

Qy  121 HHGHPGTDLDPDYYNGHPQNFFLWYLHFMKSYWRWTQIFGLVMIFHGLKNLVHIPENNLI 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 HHGHPGTDLDPDYYNGHPQNFFLWYLHFMKSYWRWTQIFGLVMIFHGLKNLVHIPENNLI 180

Qy  181 [illegible] 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 [illegible] 240

Qy  241 [illegible] 258
        ||||||||||||||||||
Db  241 [illegible] 258


RESULT 2 